1. Introduction

Recent  neurophysiological  experiments  have  shed  new  light  onto  how  various 
sound  features  are  encoded  and  organized  in  the  primary  auditory  cortex  (AI).  One  such 
feature  is  the  envelope  of  broadband  acoustic  spectra,  or  spectral  profile,  the  most 
important  physical  correlate  of  timbre  (Plomp  1976).  To  determine  how  AI  units 
represent  complex  dynamic  profiles,  it  is  essential  to  measure  their  spectro-temporal 
response  field  (STRF).  This  function  is  analogous  to  the  receptive  field  of  visual  neurons 
in  that  it  reflects  the  strength  and  dynamics  of  the  unit  responses  to  tones  at  different 
frequencies.  A  more  traditional  response  measure  of  auditory  units  is  the  “response  area”, 
defined  roughly  as  the  range  of  tone  frequencies  and  intensities  that  just  elicit  excitatory 
or  inhibitory  responses.  The  response  area  is  only  useful  as  a  qualitative  predictor  of  a 
unit’s responses to arbitrary broadband spectra; its measurement is also significantly
affected  by  a  host  of  experimental  difficulties  and  nonlinear  factors  that  render  estimates 
of  parameters  such  as  bandwidth  and  asymmetry  quantitatively  inaccurate  (Shamma 
1993,  Shamma  et  al.  1995a,  Nelken  et  al.  1994). 

To  circumvent  some  of  these  problems,  we  have  used  new  techniques  to  measure 
the  spectral  and  dynamic  properties  of  response  areas  in  AI  (Schreiner  and  Calhoun  1995, 
Shamma  et  al.  1995a).  The  stimuli  and  techniques — adapted  from  vision  research  (De 
Valois  1990)  and  from  psychoacoustic  studies  (Green  1986,  Hillier  1991,  Summers  and 
Leek  1994) — apply  linear  system  theory  to  measure  the  response  area  of  cortical  units. 
Specifically, they employ broadband spectra with sinusoidally modulated profiles against
the logarithmic frequency axis, called ripples, shown in Fig. 1. By varying the ripple
density  (or  frequency),  amplitude,  and  drift  velocity,  one  can  measure  a  transfer  function 
to  such  rippled  spectra,  and  from  it  by  an  inverse  Fourier  transform  obtain  the  STRF. 

A  fundamental  assumption  of  these  techniques  (reviewed  briefly  below)  is  that  the 
responses  to  such  broadband  stimuli  are  substantially  linear  with  respect  to  spectral 
profiles,  that  is  they  satisfy  the  superposition  principle.  This  principle  means  that  if  a 
complex  spectral  profile  is  decomposed  into  a  sum  of  several  simple  ripple  spectra,  then 
the  unit  response  to  the  complex  profile  must  equal  the  sum  of  the  responses  to  each  of 
these ripples. Superposition has been validated using both stationary (Shamma et al.
1995a,b)  and  dynamic  spectra  that  can  be  decomposed  into  any  combination  of 
downward  moving  ripples  (Kowalski  et  al.  1996a,  b). 


Report  Documentation  Page 

Title: Representation of Complex Dynamic Spectra in Auditory Cortex
Dates covered: 1997
Performing organization: University of Maryland, Department of Electrical Engineering and
Computer Engineering, Institute for Systems Research, College Park, MD 20742
Distribution/availability: Approved for public release; distribution unlimited
Security classification (report, abstract, this page): unclassified
Limitation of abstract: Same as Report (SAR)
Number of pages: 7
Standard Form 298 (Rev. 8-98), prescribed by ANSI Std Z39-18


Natural  spectra  such  as  speech,  music,  and  various  natural  sounds  are  composed  of 
both  downward  and  upward  moving  ripples.  Consequently,  linearity  (or  superposition) 
must  be  validated  for  both  directions  of  moving  ripples.  Furthermore,  if  linearity  holds, 
we  can  derive  unit  STRFs  from  complete  measurements  of  the  transfer  functions  for 
ripples  moving  in  both  directions,  followed  by  a  2-dimensional  inverse  Fourier  transform. 


Figure 1 (panel titles: Ripple, Spectrogram, STRF)

Computing responses with the STRF. Left: The ripple spectral profile consists of 101 tones equally spaced along the
logarithmic frequency axis, spanning 5 octaves (e.g., 0.25-8 kHz). The sinusoidally modulated envelope has a ripple
density or frequency ($\Omega$) given in units of cycles/octave; its constant velocity $\omega$ is defined as the number of ripple
cycles traversing the lower edge of the spectrum per second (Hz). Middle: The response of a unit is deduced
from its STRF, shown here against the tonotopic axis with white representing positive amplitudes and black
negative ones. Note that the STRF is a function of time and can be convolved with the input spectrogram to compute
the expected response shown on the right.
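
The ripple profile described in the caption can be sketched in a few lines. This is an illustration only, not the stimulus-generation code used in the experiments: the function names, the modulation depth `depth`, the starting phase, and the sign convention for the drift direction are all assumptions.

```python
import math


def tone_frequencies(f_low=250.0, n_tones=101, octaves=5.0):
    """Frequencies (Hz) of tones equally spaced on a logarithmic axis."""
    return [f_low * 2.0 ** (octaves * k / (n_tones - 1)) for k in range(n_tones)]


def ripple_envelope(t, n_tones=101, octaves=5.0, ripple_density=0.4,
                    velocity=8.0, depth=0.9, phase=0.0):
    """Linear amplitude of each tone at time t (seconds).

    x is the tone position in octaves above the lowest tone. The envelope is
    sinusoidal in x with `ripple_density` cycles/octave and drifts at
    `velocity` cycles/second (sign convention assumed, see lead-in).
    """
    envelope = []
    for k in range(n_tones):
        x = octaves * k / (n_tones - 1)
        arg = 2.0 * math.pi * (ripple_density * x + velocity * t) + phase
        envelope.append(1.0 + depth * math.sin(arg))
    return envelope


freqs = tone_frequencies()            # 250 Hz up to 8 kHz
profile = ripple_envelope(t=0.0)      # spectral profile at stimulus onset
```

At the lowest tone (x = 0) the envelope completes `velocity` cycles per second, which matches the definition of ripple velocity given in the caption.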


When  dealing  with  both  the  temporal  and  spectral  aspects  of  the  responses,  an 
important  issue  is  the  separability  of  these  two  dimensions,  i.e.,  whether  these  response 
properties  can  be  measured  independently  of  each  other.  Without  separability,  transfer 
functions  must  be  measured  at  all  combinations  of  velocities  and  ripple  densities,  an 
impractical proposition given the extended times needed to hold a unit. With separability,
it  is  sufficient  instead  to  measure  the  temporal  transfer  function  at  one  ripple  density,  and 
the  spectral  transfer  function  at  one  velocity;  the  complete  transfer  function  is  then  taken 
as  the  product  of  these  two  one-dimensional  transfer  functions. 
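
Under separability, the full transfer function on a grid of ripple densities and velocities can be assembled from the two one-dimensional measurements. The sketch below is illustrative: the function name, the list-based layout, the sample values, and the division by the shared measurement (so that the product reproduces both measured curves) are assumptions, not details given in the text.

```python
def separable_transfer(T_spectral, T_temporal, i0, k0):
    """Assemble a 2-D transfer function from two 1-D measurements.

    T_spectral[i]: complex response at ripple density i, measured at velocity index k0.
    T_temporal[k]: complex response at velocity k, measured at ripple density index i0.
    Both curves contain the same stimulus (density i0, velocity k0); its
    response is used to normalize the product (an assumed convention).
    """
    ref = 0.5 * (T_spectral[i0] + T_temporal[k0])  # shared measurement, averaged
    if ref == 0:
        raise ValueError("reference response is zero; cannot normalize")
    return [[ts * tt / ref for tt in T_temporal] for ts in T_spectral]


# Hypothetical measurements: 3 ripple densities and 4 velocities.
T_spectral = [0.5 + 0.0j, 1.0 + 1.0j, 0.2 - 0.3j]
T_temporal = [0.4j, 1.0 + 1.0j, 0.8 + 0.1j, 0.1 + 0.0j]
T_full = separable_transfer(T_spectral, T_temporal, i0=1, k0=1)

# Stimuli needed with separability versus the full grid.
n_measured = len(T_spectral) + len(T_temporal) - 1
n_full_grid = len(T_spectral) * len(T_temporal)
```

With these sizes the separable approach needs 6 stimuli instead of 12; the saving grows quickly as the grid becomes finer.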

In  previous  investigations  using  downward  moving  ripples  (Kowalski  et  al. 
1996a,b),  it  was  demonstrated  that  the  temporal  and  spectral  transfer  functions  are  indeed 
separable.  This  is  because  predictions  of  responses  to  various  combinations  of  rippled 
spectra  could  be  made  from  transfer  functions  measured  at  the  best  ripple  velocity  and 
ripple  density  only.  Here,  the  notion  of  separability  is  extended  to  take  into  account  both 
directions  of  moving  ripples.  To  validate  it,  we  shall  assume  that  the  complete  spectro- 
temporal  transfer  function  is  quadrant  separable,  i.e.,  is  the  product  of  transfer  functions 
measured  at  one  ripple  velocity  and  density,  in  each  direction.  We  then  proceed  to  derive 
the corresponding unit’s STRF, and to predict the responses to various combinations of
complex dynamic spectra composed of ripples moving in both directions. Fair predictions
of responses to these spectra are taken as a validation of both the linearity and separability
of the system.


2.  Methods 


The data used here were collected from AI of two ketamine/xylazine-anesthetized
domestic ferrets (Mustela putorius). Details of the surgery are as in Shamma et al. (1993).

Moving ripples can be used to measure the temporal and ripple transfer functions of
cells, and hence derive their STRFs (Kowalski et al. 1996a,b). The basic procedure is
summarized here for ripple transfer function measurements. Fig. 2 illustrates the response
to moving ripples at a fixed $\omega = 8$ Hz, and ripple frequency $\Omega$ from -1.6 to +1.6
cycles/octave (Fig. 2A), i.e., upward and downward moving ripples. The magnitude and
phase of the synchronized response at each $\Omega$ are derived and plotted as the magnitude and
phase of the ripple transfer function $T_\omega(\Omega) = |T_\omega(\Omega)|\,e^{j\Phi_\omega(\Omega)}$ (Fig. 2B). The temporal
transfer function $T_\Omega(\omega)$ is measured similarly, with the result shown in Fig. 2C.
Figure 2

Transfer function measurements using moving ripples. A: Raster responses to a ripple moving at $\omega = 8$ Hz with
ripple frequencies $\Omega = -1.6$ to $1.6$ cycles/octave (or, equivalently, ripples moving at $\omega = 8$ Hz with ripple
frequencies $\Omega = 0$ to $1.6$ cycles/octave, and ripples moving at $\omega = -8$ Hz with ripple frequencies $\Omega = 0.2$ to $1.6$
cycles/octave). B, C: The amplitude and phase of the fitted responses as a function of $\Omega$ and $\omega$. A straight-line fit
of the phase data points is also shown.
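
One way to obtain the magnitude and phase of the synchronized response is to take the Fourier component of the period histogram at the ripple velocity. The sketch below assumes a histogram of firing rates over exactly one stimulus period; the function name and the synthetic data are hypothetical, and the fitting procedure actually used for the figure may differ.

```python
import cmath
import math


def synchronized_component(period_histogram):
    """Amplitude and phase of the response component locked to the ripple.

    period_histogram[k] is the firing rate in bin k of one stimulus period.
    For a rate of the form B + A*cos(2*pi*k/n + phi) the function returns
    (A, phi); the mean rate B does not contribute.
    """
    n = len(period_histogram)
    if n < 3:
        raise ValueError("need at least 3 bins to resolve the fundamental")
    c = sum(r * cmath.exp(-2j * math.pi * k / n)
            for k, r in enumerate(period_histogram)) * 2.0 / n
    return abs(c), cmath.phase(c)


# Synthetic check: 16 bins, mean rate 5, modulation amplitude 3, phase 0.5 rad.
hist = [5.0 + 3.0 * math.cos(2.0 * math.pi * k / 16 + 0.5) for k in range(16)]
amplitude, phase = synchronized_component(hist)
```

Repeating this for every ripple frequency at a fixed velocity gives one sample of the ripple transfer function per stimulus; repeating it across velocities at a fixed ripple frequency gives the temporal transfer function.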

The transfer functions $T_\omega(\Omega)$ and $T_\Omega(\omega)$ are each valid at only one velocity or ripple
frequency, as indicated by the subscripts $\omega$ and $\Omega$. In principle, one must measure the
two-dimensional transfer function at all $\Omega$ and $\omega$ within each quadrant and compute a
two-dimensional STRF from the inverse transform. However, if we assume that, apart from a
scale change, these functions remain unchanged over the most responsive range of $\Omega$ and
$\omega$, the temporal and ripple transfer functions can be treated as separable from each other,
and measured at one $\Omega$ and $\omega$, respectively.


There  is  a  stronger  (and  more  strict)  notion  of  separability,  independent  of  direction, 
i.e.,  applying  uniformly  across  all  quadrants  (McLean  and  Palmer  1994,  Watson  and 
Ahumada 1985). This full separability occurs if the responses to downward- and upward-moving
ripples are identical, implying a transfer function that is symmetric about the $\omega$ axis,
and a symmetric STRF. These issues have already been the subject of theoretical and
experimental  studies  in  visual  cortex,  where  quadrant-separability  and  a  significant 
measure  of  linearity  have  been  demonstrated  (McLean  and  Palmer  1994). 

For a fully separable unit, the transfer function is a direct product,
$T(\Omega,\omega) = T(\Omega)\,T(\omega) = |T(\Omega)|\,e^{j\Phi(\Omega)}\,|T(\omega)|\,e^{j\Phi(\omega)}$. For a unit that is only quadrant separable,
the transfer function is a sum of direct products, one for each of the right quadrants of the $\Omega$-$\omega$ plane:
$T(\Omega,\omega) = T_{\pm}(\Omega)\,T_{\pm}(\omega)$ for $\pm\omega > 0$, $\Omega > 0$. Fig. 3 illustrates the full STRF obtained from the
inverse Fourier transform of the compound transfer function $T(\Omega,\omega)$ for three units.
Apparent structure in the plots far away from the main body of the STRF is simply due
to aliasing and noise effects. The STRF example in the middle panel is asymmetric, with
strong inhibition from the high-frequency side. The unit in the third panel is almost fully
separable, as can be seen from the spectral symmetry of its STRF at all times.
Figure 3

Examples of STRFs. Left: Response field computed from the inverse Fourier transform of the full two-dimensional
transfer function of the unit in Figs. 2 and 3. Middle and Right: Two more examples: a highly
asymmetric STRF (middle) and a slow, symmetric STRF with relatively broad bandwidth (right).
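
The step from a quadrant-wise transfer function to an STRF can be sketched as a direct inverse Fourier synthesis. This is a minimal illustration under stated assumptions: the function name, the argument layout, the sign convention of the exponent, and the absence of any normalization are hypothetical choices, and terms at zero ripple density or zero velocity are left out.

```python
import cmath
import math


def strf_from_quadrants(T_pos, T_neg, densities, velocities, xs, ts):
    """Real STRF on a grid of positions xs (octaves) and times ts (seconds).

    T_pos[i][k]: complex transfer function at (densities[i], +velocities[k]).
    T_neg[i][k]: complex transfer function at (densities[i], -velocities[k]).
    All densities and velocities must be > 0. The two remaining quadrants are
    taken as complex conjugates so that the STRF is real, which is why each
    term appears as 2 * Re{...}.
    """
    strf = []
    for x in xs:
        row = []
        for t in ts:
            total = 0.0
            for i, dens in enumerate(densities):
                for k, vel in enumerate(velocities):
                    pos = T_pos[i][k] * cmath.exp(2j * math.pi * (dens * x + vel * t))
                    neg = T_neg[i][k] * cmath.exp(2j * math.pi * (dens * x - vel * t))
                    total += 2.0 * (pos.real + neg.real)
            row.append(total)
        strf.append(row)
    return strf


def outer(a, b):
    """Direct product of two 1-D transfer functions (one quadrant)."""
    return [[ai * bk for bk in b] for ai in a]


# Hypothetical quadrant-separable unit: one product per velocity sign.
densities, velocities = [0.4, 0.8], [4.0, 8.0]
T_pos = outer([1.0 + 0.0j, 0.5j], [1.0 + 0.0j, 0.3 - 0.2j])
T_neg = outer([0.6 + 0.0j, 0.2j], [0.8 + 0.0j, 0.1 + 0.1j])
xs = [0.25 * m for m in range(8)]     # 2 octaves in 0.25-octave steps
ts = [0.01 * m for m in range(25)]    # 250 ms in 10 ms steps
strf = strf_from_quadrants(T_pos, T_neg, densities, velocities, xs, ts)
```

On a regular grid the same result could be computed with a two-dimensional inverse FFT; the explicit sum is used here only to keep the conjugate-symmetry step visible.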


A  periodic  stimulus  composed  of  a  linear  sum  of  ripples  of  different  ripple 
frequencies and velocities is predicted to give a response that is a linear sum of
sinusoids with amplitudes and phases weighted by the transfer function
$T(\Omega,\omega) = |T(\Omega,\omega)|\,e^{j\Phi(\Omega,\omega)}$. Applying such superposition to predict responses to complex
stimuli  is  only  meaningful  if  the  system  is  essentially  linear;  so  accurate  predictions  are  a 
verification  of  linearity.  Furthermore,  the  transfer  function  is  calculated  under  the 
assumption  of  (quadrant)  separability,  so  accurate  predictions  are  also  a  verification  of 
separability. 
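
The superposition prediction can be written out directly: each ripple in the stimulus contributes one sinusoid at its own velocity, scaled by the magnitude of the transfer function and shifted by its phase. The sketch below is illustrative only; the function names, the tuple layout of a ripple, and the toy transfer function are hypothetical.

```python
import cmath
import math


def predict_response(ripples, transfer, times):
    """Predicted response waveform for a sum of moving ripples.

    ripples:  list of (amplitude, density, velocity, phase) tuples.
    transfer: callable (density, velocity) -> complex transfer function value.
    times:    sample times in seconds.
    """
    out = []
    for t in times:
        y = 0.0
        for amplitude, density, velocity, phase in ripples:
            T = transfer(density, velocity)
            y += amplitude * abs(T) * math.cos(
                2.0 * math.pi * velocity * t + phase + cmath.phase(T))
        out.append(y)
    return out


def toy_transfer(density, velocity):
    """Placeholder transfer function: gain 2, phase lead of 90 degrees."""
    return 2.0j


r1 = (1.0, 0.4, 4.0, 0.0)
r2 = (0.5, 0.2, -8.0, 0.3)
times = [0.0, 1.0 / 16.0, 0.1]
both = predict_response([r1, r2], toy_transfer, times)
first = predict_response([r1], toy_transfer, times)
second = predict_response([r2], toy_transfer, times)
```

By construction, the prediction for the two-ripple stimulus equals the sum of the single-ripple predictions; comparing such a prediction with the measured response is what tests linearity.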

Predictions can equivalently be made by a simple convolution of the STRF with the
spectrogram of the stimulus, as illustrated in Fig. 4. In each case, the predicted waveform
is plotted together with the response for visual comparison. The scale of the predicted
waveform is arbitrary; however, its (zero) baseline is set at the firing rate for the
stationary stimulus at $\omega = 0$ in Fig. 2C. Three more examples of predictions with multiple
ripples are shown in Fig. 4B.
Figure 4

Predictions  of  the  responses  to  complex  dynamic  spectra  using  the  STRF.  (A)  The  predicted  response  is 
computed  by  a  convolution  (along  the  time  dimension)  of  the  STRF  with  the  spectrogram.  The  stimulus  shown  is 
composed  of  two  ripples  (0.4  c/o  at  12  and  -4  Hz).  Same  cell  as  in  Fig. 2.  The  predicted  waveform  (dashed  line) 
is  juxtaposed  against  the  actual  response  (solid  line)  over  one  period  of  the  stimulus.  (B)  Three  additional 
examples of predictions using: (right and left panels) three ripples (0.4 c/o at 12 and -4 Hz, and 0.2 c/o at -8 Hz) in 2
units;  (middle  panel)  two  ripples  (0.6  c/o  at  -8  Hz,  and  0.2  c/o  at  24  Hz)  for  the  same  unit  as  the  left  panel. 
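
The convolution along the time dimension described in the caption can be sketched as follows. The nested-list layout, the function name, and the treatment of the stimulus as zero before its onset are assumptions made for illustration; for a periodic stimulus a circular convolution over one period would be the natural alternative.

```python
def predict_by_convolution(strf, spectrogram):
    """Predicted response r[t] = sum_x sum_tau strf[x][tau] * spectrogram[x][t - tau].

    strf[x][tau]:      STRF for frequency channel x at lag tau (in time bins).
    spectrogram[x][t]: stimulus envelope in channel x at time bin t.
    The stimulus is treated as zero before t = 0 (assumed convention).
    """
    if len(strf) != len(spectrogram):
        raise ValueError("STRF and spectrogram need the same number of channels")
    n_t = len(spectrogram[0])
    response = [0.0] * n_t
    for h_row, s_row in zip(strf, spectrogram):
        for t in range(n_t):
            for tau, h in enumerate(h_row):
                if t - tau >= 0:
                    response[t] += h * s_row[t - tau]
    return response


# Toy example: 2 frequency channels, 2 lags, 3 time bins.
toy_strf = [[1.0, 0.5],
            [0.0, -1.0]]
toy_spec = [[1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0]]
toy_response = predict_by_convolution(toy_strf, toy_spec)
```

Each channel is filtered in time by its own row of the STRF, and the channel outputs are then summed to give a single response waveform.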


3.  Results 

Recordings  so  far  have  been  made  from  70  units  in  2  animals.  Many  different  types 
of  STRFs  have  been  observed  as  illustrated  in  Fig. 3.  They  include  symmetric  (left  and 
right  panels)  and  asymmetric  (middle  panel),  slow  (right  panel)  and  fast  dynamics  (left 
panel). We have not yet obtained a large enough sample of units to describe the
statistical  distribution  of  these  properties  in  AI. 

AI  responses  to  spectra  composed  of  a  few  ripples  can  be  reasonably  well  predicted 
from responses to single ripples by applying the superposition principle. This was shown
earlier in Fig. 4 using 2- and 3-ripple stimuli. Fig. 5 illustrates additional examples with
dynamic spectra composed of up to 16 moving ripples with different velocities, phases,
and ripple frequencies. In all cases, predicted responses assuming linearity and quadrant
separability compare well with those measured experimentally.


4.  Discussion 

Linearity  and  separability  are  validated  by  the  successful  prediction  of  responses 
using the superposition principle and spectro-temporal transfer functions measured at one
velocity and ripple density. These properties have allowed us to derive STRFs for all
units  recorded,  illustrating  their  diverse  and  complex  structure. 


Figure 5 (panel titles: Stimulus Spectrogram, Spectro-Temporal RF)

Predicting  the  final  response  to  large  combinations  of  moving  ripples  in  the  same  unit.  A:  Spectrograms  of  the 
stimulus,  along  with  the  ripple  content.  B:  STRF  of  the  cell  (computed  as  described  in  Fig. 5).  All  other  details  of 
the figure are as in Fig. 4.


Linearity  and  separability  simplify  enormously  the  measurement  and  prediction  of 
responses  to  complex  dynamic  spectra  and  reflect  certain  fundamental  operational 
principles of the auditory, and perhaps other, sensory systems. The persistence of response
linearity to broadband stimuli suggests that the operative nonlinearities are those that do
not destroy the basic linear character of the responses, such as threshold, half-wave
rectification, and saturation in the early auditory stages. These limit significantly the
dynamic range of the linear responses to ripple stimuli, but do not severely distort the
overall gross shape of the responses (Shamma et al. 1995a; Kowalski et al. 1996a). It
should  be  emphasized  here,  however,  that  all  ripple  stimuli  are  broadband  in  nature,  and 
hence  response  linearity  cannot  be  confirmed  for  narrowband  stimuli  such  as  stationary, 
AM,  or  FM  tones. 

At  first  glance,  the  finding  that  separability  holds  for  complex  dynamic  spectra  is 
surprising,  given  the  intertwining  of  temporal  and  spectral  processing  along  the  auditory 
pathway  up  to  the  cortex.  One  possible  implication  of  this  finding  is  that  cortical 
temporal  and  spectral  processing  occur  as  two  essentially  separate  sequential  stages.  In 
such  a  model,  the  first  stage  would  have  a  purely  spectral  transfer  function,  followed  by 
temporal filtering in the second stage. The overall spectro-temporal transfer function
would then be the product of the two transfer functions. This model is plausible if one
assumes  that  response  area  shape  (or  the  spectral  cross-section  of  the  STRF),  e.g., 
asymmetry  and  bandwidth,  is  due  to  the  organization  of  the  thalamic  (MGB)  input 
projections  to  the  AI  or  earlier  stages.  The  slow  temporal  responses  (rates  mostly  under 
12  Hz)  are  likely  to  be  related  to  the  cortico-thalamic  feedback  loops.  It  is  also  likely  that 
visual  and  other  sensory  pathways  share  the  linearity  and  separability  since  none  of  the 
above  physiological  features  are  unique  to  AI. 